Smart Paper Notes

Robustness Implies Privacy in Statistical Estimation

Samuel B. Hopkins , Gautam Kamath , Mahbod Majid , Shyam Narayanan

Category: Machine Learning (Statistics)

2022-12-09

We study the relationship between adversarial robustness and differential privacy in high-dimensional algorithmic statistics. We give the first black-box reduction from privacy to robustness which can produce private estimators with optimal tradeoffs among sample complexity, accuracy, and privacy for a wide range of fundamental high-dimensional parameter estimation problems, including mean and covariance estimation. We show that this reduction can be implemented in polynomial time in some important special cases. In particular, using nearly-optimal polynomial-time robust estimators for the mean and covariance of high-dimensional Gaussians which are based on the Sum-of-Squares method, we design the first polynomial-time private estimators for these problems with nearly-optimal samples-accuracy-privacy tradeoffs. Our algorithms are also robust to a constant fraction of adversarially-corrupted samples.

translated by Google Translate

Exponentially Improving the Complexity of Simulating the Weisfeiler-Lehman Test with Graph Neural Networks

Anders Aamand , Justin Y. Chen , Piotr Indyk , Shyam Narayanan , Ronitt Rubinfeld , Nicholas Schiefer , Sandeep Silwal , Tal Wagner

Categories: Machine Learning | Machine Learning (Statistics)

2022-11-06

Recent work shows that the expressive power of Graph Neural Networks (GNNs) in distinguishing non-isomorphic graphs is exactly the same as that of the Weisfeiler-Lehman (WL) graph test. In particular, they show that the WL test can be simulated by GNNs. However, those simulations involve neural networks for the 'combine' function of size polynomial or even exponential in the number of graph nodes $n$, as well as feature vectors of length linear in $n$. We present an improved simulation of the WL test on GNNs with \emph{exponentially} lower complexity. In particular, the neural network implementing the combine function in each node has only a polylogarithmic number of parameters in $n$, and the feature vectors exchanged by the nodes of the GNN consist of only $O(\log n)$ bits. We also give logarithmic lower bounds for the feature vector length and the size of the neural networks, showing the (near)-optimality of our construction.
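
For readers unfamiliar with the WL test mentioned above, the following is a minimal sketch of classical 1-WL color refinement in plain Python. It is not the GNN construction from the paper; it only illustrates the test being simulated. The function name `wl_indistinguishable` and the adjacency-dict input format are assumptions made for this illustration.

```python
from collections import Counter


def wl_indistinguishable(adj_a, adj_b):
    """Return True if 1-WL color refinement cannot tell the two graphs apart.

    Each graph is a dict mapping a node to a list of its neighbors
    (hypothetical input format chosen for this sketch).
    """
    # Work on the disjoint union so both graphs share one color palette.
    adj = {(0, v): [(0, u) for u in nbrs] for v, nbrs in adj_a.items()}
    adj.update({(1, v): [(1, u) for u in nbrs] for v, nbrs in adj_b.items()})

    colors = {v: 0 for v in adj}  # every node starts with the same color
    for _ in range(len(adj)):
        # "Combine" step: own color plus the sorted multiset of neighbor colors.
        sigs = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Compress each distinct signature to a small integer color.
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        new_colors = {v: palette[sigs[v]] for v in adj}
        # Refinement only splits classes, so an unchanged class count
        # means the partition is stable.
        stable = len(set(new_colors.values())) == len(set(colors.values()))
        colors = new_colors
        if stable:
            break

    hist_a = Counter(c for (g, _), c in colors.items() if g == 0)
    hist_b = Counter(c for (g, _), c in colors.items() if g == 1)
    return hist_a == hist_b


if __name__ == "__main__":
    two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                     3: [4, 5], 4: [3, 5], 5: [3, 4]}
    hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
    print(wl_indistinguishable(two_triangles, hexagon))
```

Two disjoint triangles and a 6-cycle are both 2-regular, so this test cannot separate them even though they are not isomorphic; this is the standard example of the limits of 1-WL.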

Private High-Dimensional Hypothesis Testing

Shyam Narayanan

Category: Machine Learning

2022-03-03

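We provide improved differentially private algorithms for identity testing of high-dimensional distributions. Specifically, for $d$-dimensional Gaussian distributions with known covariance $\Sigma$, we can test whether the distribution comes from $\mathcal{N}(\mu^*, \Sigma)$ for some fixed $\mu^*$ or from some $\mathcal{N}(\mu, \Sigma)$ with total variation distance at least $\alpha$ from $\mathcal{N}(\mu^*, \Sigma)$ with $(\varepsilon, 0)$-differential privacy, using only \[\tilde{O}\left(\frac{d^{1/2}}{\alpha^2} + \frac{d^{1/3}}{\alpha^{4/3} \cdot \varepsilon^{2/3}} + \frac{1}{\alpha \cdot \varepsilon}\right)\] samples if the algorithm is allowed to be computationally inefficient, and only \[\tilde{O}\left(\frac{d^{1/2}}{\alpha^2} + \frac{d^{1/4}}{\alpha \cdot \varepsilon}\right)\] samples for a computationally efficient algorithm. We also provide a matching lower bound showing that our computationally inefficient algorithm has optimal sample complexity. We also extend our algorithms to various related problems, including mean testing of Gaussians with bounded but unknown covariance, uniformity testing of product distributions over $\{-1, 1\}^d$, and tolerant testing. Our results improve on the previous best work of Canonne et al., and match the optimal non-private sample complexity of $O\left(\frac{\sqrt{d}}{\alpha^2}\right)$ in many standard parameter settings. In addition, our results show that, surprisingly, private identity testing of $d$-dimensional Gaussians can be done with fewer samples than private identity testing of discrete distributions over a domain of size $d$ \cite{AcharyaSZ18}, refuting a conjectured lower bound of~\cite{CanonneKMUZ20}.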
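
To get a feel for how the two sample-complexity bounds above compare, here is a rough numerical sketch. It evaluates only the expressions inside $\tilde{O}(\cdot)$, so constants and logarithmic factors are dropped and the numbers are not actual sample counts. The function names `inefficient_samples` and `efficient_samples` are hypothetical labels chosen for this illustration.

```python
def inefficient_samples(d: float, alpha: float, eps: float) -> float:
    """Expression inside the first bound (computationally inefficient tester)."""
    return (
        d ** 0.5 / alpha ** 2
        + d ** (1 / 3) / (alpha ** (4 / 3) * eps ** (2 / 3))
        + 1 / (alpha * eps)
    )


def efficient_samples(d: float, alpha: float, eps: float) -> float:
    """Expression inside the second bound (computationally efficient tester)."""
    return d ** 0.5 / alpha ** 2 + d ** 0.25 / (alpha * eps)


if __name__ == "__main__":
    d, alpha = 10 ** 4, 0.1
    # Sweep the privacy parameter; smaller eps means stronger privacy.
    for eps in (1.0, 0.1, 0.001):
        print(eps, inefficient_samples(d, alpha, eps), efficient_samples(d, alpha, eps))
```

The middle term of the first expression is a weighted geometric mean of the two terms of the second, $\left(\frac{d^{1/2}}{\alpha^2}\right)^{1/3} \left(\frac{d^{1/4}}{\alpha \varepsilon}\right)^{2/3}$, so up to a constant factor the first expression never exceeds the second, and the gap opens up as $\varepsilon$ shrinks.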

Tight and Robust Private Mean Estimation with Few Users

Hossein Esfandiari , Vahab Mirrokni , Shyam Narayanan

Category: Machine Learning

2021-10-22

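In this work, we study high-dimensional mean estimation under user-level differential privacy, and design an $(\varepsilon, \delta)$-differentially private mechanism using as few users as possible. In particular, this is possible even when the number of users is as low as $O(\frac{1}{\varepsilon} \log \frac{1}{\delta})$. Interestingly, this bound on the number of \emph{users} is independent of the dimension (though the number of \emph{samples per user} is allowed to depend polynomially on the dimension), unlike previous work, which requires the number of users to depend polynomially on the dimension. This resolves a problem first proposed by Amin et al. Moreover, our mechanism is robust against corruptions in up to $49\%$ of the users. Finally, our results also apply to optimal algorithms for privately learning discrete distributions with few users, answering a question of Liu et al., and to a broader range of problems such as stochastic convex optimization and a variant of stochastic gradient descent, via a reduction to differentially private mean estimation.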

Curator: Creating Large-Scale Curated Labelled Datasets using Self-Supervised Learning

Tarun Narayanan , Ajay Krishnan , Anirudh Koul , Siddha Ganju

Category: Computer Vision

2022-12-28

Applying machine learning to domains like Earth Sciences is impeded by the lack of labeled data, despite a large corpus of raw data available in such domains. For instance, training a wildfire classifier on satellite imagery requires curating a massive and diverse dataset, which is an expensive and time-consuming process that can span from weeks to months. Searching for relevant examples in over 40 petabytes of unlabelled data requires researchers to manually hunt for such images, much like finding a needle in a haystack. We present a no-code end-to-end pipeline, Curator, which dramatically reduces the time taken to curate an exhaustive labeled dataset. Curator is able to search massive amounts of unlabelled data by combining self-supervision, scalable nearest neighbor search, and active learning to learn and differentiate image representations. The pipeline can also be readily applied to solve problems across different domains. Overall, the pipeline makes it practical for researchers to go from just one reference image to a comprehensive dataset in a short span of time.

Interactive Segmentation of Radiance Fields

Rahul Goel , Dhawal Sirikonda , Saurabh Saini , PJ Narayanan

Category: Computer Vision

2022-12-27

Radiance Fields (RFs) are a popular way to represent casually captured scenes for new view generation and have been used for applications beyond it. Understanding and manipulating scenes represented as RFs must naturally follow to facilitate mixed reality in personal spaces. Semantic segmentation of objects in the 3D scene is an important step toward that. Prior segmentation efforts using feature distillation show promise but don't scale to complex objects with diverse appearance. We present a framework to interactively segment objects with fine structure. Nearest neighbor feature matching identifies high-confidence regions of the objects using distilled features. Bilateral filtering in a joint spatio-semantic space grows the region to recover accurate segmentation. We show state-of-the-art results of segmenting objects from RFs and compositing them to another scene, changing appearance, etc., moving closer to rich scene manipulation and understanding. Project Page: https://rahul-goel.github.io/isrf/

StyleTRF: Stylizing Tensorial Radiance Fields

Rahul Goel , Sirikonda Dhawal , Saurabh Saini , P. J. Narayanan

Category: Computer Vision

2022-12-19

Stylized view generation of scenes captured casually using a camera has received much attention recently. The geometry and appearance of the scene are typically captured as neural point sets or neural radiance fields in the previous work. An image stylization method is used to stylize the captured appearance by training its network jointly or iteratively with the structure capture network. The state-of-the-art SNeRF method trains the NeRF and stylization network in an alternating manner. These methods have high training time and require joint optimization. In this work, we present StyleTRF, a compact, quick-to-optimize strategy for stylized view generation using TensoRF. The appearance part is fine-tuned using sparse stylized priors of a few views rendered using the TensoRF representation for a few iterations. Our method thus effectively decouples style-adaption from view capture and is much faster than the previous methods. We show state-of-the-art results on several scenes used for this purpose.

A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness

Tiantian Feng , Rajat Hebbar , Nicholas Mehlman , Xuan Shi , Aditya Kommineni , Shrikanth Narayanan

Category: Machine Learning

2022-12-18

Speech-centric machine learning systems have revolutionized many leading domains ranging from transportation and healthcare to education and defense, profoundly changing how people live, work, and interact with each other. However, recent studies have demonstrated that many speech-centric ML systems may need to be considered more trustworthy for broader deployment. Specifically, concerns over privacy breaches, discriminating performance, and vulnerability to adversarial attacks have all been discovered in ML research fields. In order to address the above challenges and risks, a significant number of efforts have been made to ensure these ML systems are trustworthy, especially private, safe, and fair. In this paper, we conduct the first comprehensive survey on speech-centric trustworthy ML topics related to privacy, safety, and fairness. In addition to serving as a summary report for the research community, we point out several promising future research directions to inspire the researchers who wish to explore further in this area.

On Safe and Usable Chatbots for Promoting Voter Participation

Bharath Muppasani , Vishal Pallagani , Kausik Lakkaraju , Shuge Lei , Biplav Srivastava , Brett Robertson , Andrea Hickerson , Vignesh Narayanan

Category: Natural Language Processing

2022-12-16

Chatbots, or bots for short, are multi-modal collaborative assistants that can help people complete useful tasks. When chatbots are mentioned in connection with elections, they often draw negative reactions due to the fear of misinformation and hacking. Instead, in this paper, we explore how chatbots may be used to promote voter participation in vulnerable segments of society like senior citizens and first-time voters. In particular, we build a system that amplifies official information while transparently personalizing it to users' unique needs. We discuss its design, build prototypes with frequently asked questions (FAQ) election information for two US states that are low on an ease-of-voting scale, and report on its initial evaluation in a focus group. Our approach can be a win-win for voters, election agencies trying to fulfill their mandate, and democracy at large.

Optimal Control for Quadruped Locomotion using LTV MPC

Andrew Zheng , Sriram S. K. S Narayanan , Umesh G Vaidya

Category: Robotics

2022-12-10

This paper presents a state-of-the-art optimal controller for quadruped locomotion. The robot dynamics is represented using a single rigid body (SRB) model. A linear time-varying model predictive controller (LTV MPC) is proposed by using linearization schemes. Simulation results show that the LTV MPC can execute various gaits, such as trot and crawl, and is capable of tracking desired reference trajectories even under unknown external disturbances. The LTV MPC is implemented as a quadratic program using qpOASES through the CasADi interface at 50 Hz. The proposed MPC can reach up to 1 m/s top speed with an acceleration of 0.5 m/s² executing a trot gait. The implementation is available at https://github.com/AndrewZheng-1011/Quad_ConvexMPC
